SemanticScuttle - klotz.me » Tags: python+reinforcement learning

Tags: python* + reinforcement learning*

Data-Science-Espresso/Reinforcement-Learning-TicTacToe

A GitHub repository for a reinforcement learning Tic Tac Toe project. It contains a single Python file, TicTacToeRL.py. The repository had 0 stars and 0 forks when this description was written.

2025-05-28 Tags: reinforcement learning, tic tac toe, python, github, machine learning, q learning by klotz

Training Large Language Models with Interpreter Feedback using WebAssembly

This article details a method for training large language models (LLMs) for code generation using a secure, local WebAssembly-based code interpreter and reinforcement learning with Group Relative Policy Optimization (GRPO). It covers the setup, training process, evaluation, and potential next steps.

2025-04-04 Tags: huggingface, llm, training, code generation, webassembly, wasm, grpo, reinforcement learning, axolotl, code interpreter, fine-tuning, python by klotz
